
165 research outputs found

Randomness in neural networks: an overview

Author: Scardapane S.
Wang D.
Publication venue: Wiley
Publication date: 01/01/2017

Neural networks, as powerful tools for data mining and knowledge engineering, can learn from data to build feature-based classifiers and nonlinear predictive models. Training neural networks involves the optimization of nonconvex objective functions, and usually, the learning process is costly and infeasible for applications associated with data streams. A possible, albeit counterintuitive, alternative is to randomly assign a subset of the networks’ weights so that the resulting optimization task can be formulated as a linear least-squares problem. This methodology can be applied to both feedforward and recurrent networks, and similar techniques can be used to approximate kernel functions. Many experimental results indicate that such randomized models can reach sound performance compared to fully adaptable ones, with a number of favorable benefits, including (1) simplicity of implementation, (2) faster learning with less intervention from human beings, and (3) possibility of leveraging overall linear regression and classification algorithms (e.g., ℓ1-norm minimization for obtaining sparse formulations). This class of neural networks is attractive and valuable to the data mining community, particularly for handling large-scale data mining in real time. However, the literature in the field is extremely vast and fragmented, with many results being reintroduced multiple times under different names. This overview aims to provide a self-contained, uniform introduction to the different ways in which randomization can be applied to the design of neural networks and kernel functions. A clear exposition of the basic framework underlying all these approaches helps to clarify innovative lines of research, open problems, and most importantly, foster the exchanges of well-known results throughout different communities. WIREs Data Mining Knowl Discov 2017, 7:e1200. doi: 10.1002/widm.1200
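The core idea in this abstract (fix a subset of the weights at random, then solve a linear least-squares problem for the rest) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the tanh activation, the Gaussian weight distribution, the ridge term `reg`, and the toy sine dataset are all assumptions chosen for illustration.

```python
import numpy as np

def fit_random_feature_net(X, y, n_hidden=100, reg=1e-3, seed=0):
    """Single-hidden-layer net with random, untrained hidden weights (illustrative)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never updated
    b = rng.normal(size=n_hidden)                # random biases, never updated
    H = np.tanh(X @ W + b)                       # hidden-layer features
    # Only the output weights are learned, via ridge-regularized least squares.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression problem (assumed for the example): fit sin(x) on [-3, 3].
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0])
W, b, beta = fit_random_feature_net(X, y)
mse = float(np.mean((predict(X, W, b, beta) - y) ** 2))
print("training MSE:", mse)
```

Because the hidden layer is fixed, training reduces to one linear solve, which is where the speed and simplicity claimed in the abstract would come from.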

Archivio della ricerca- Università di Roma La Sapienza

Deep Expander Networks: Efficient Deep Networks from Graph Theory

Author: H Zhou
K He
M Masana
M Rastegari
S Hoory
S Scardapane
SP Vadhan
Publication venue
Publication date: 26/07/2018

Efficient CNN designs like ResNets and DenseNet were proposed to improve accuracy vs efficiency trade-offs. They essentially increased the connectivity, allowing efficient information flow across layers. Inspired by these techniques, we propose to model connections between filters of a CNN using graphs which are simultaneously sparse and well connected. Sparsity results in efficiency while well-connectedness can preserve the expressive power of the CNNs. We use a well-studied class of graphs from theoretical computer science that satisfies these properties, known as Expander graphs. Expander graphs are used to model connections between filters in CNNs to design networks called X-Nets. We present two guarantees on the connectivity of X-Nets: each node influences every node in a layer in logarithmic steps, and the number of paths between two sets of nodes is proportional to the product of their sizes. We also propose efficient training and inference algorithms, making it possible to train deeper and wider X-Nets effectively. Expander-based models give a 4% improvement in accuracy on MobileNet over grouped convolutions, a popular technique, which has the same sparsity but worse connectivity. X-Nets give better performance trade-offs than the original ResNet and DenseNet-BC architectures. We achieve model sizes comparable to state-of-the-art pruning techniques using our simple architecture design, without any pruning. We hope that this work motivates other approaches to utilize results from graph theory to develop efficient network architectures. Comment: ECCV'1
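The contrast drawn here between grouped convolutions and sparse but well-connected filter graphs can be sketched with binary connectivity masks. This is only an illustration under stated assumptions: a random mask with fixed in-degree stands in for an expander graph, and the paper's actual construction may differ. All function names and sizes below are hypothetical.

```python
import numpy as np

def grouped_mask(n, groups):
    """Block-diagonal mask: each output filter sees only the inputs of its own group."""
    size = n // groups
    mask = np.zeros((n, n), dtype=int)
    for g in range(groups):
        mask[g * size:(g + 1) * size, g * size:(g + 1) * size] = 1
    return mask

def random_sparse_mask(n, degree, rng):
    """Each output filter sees `degree` input filters chosen uniformly at random."""
    mask = np.zeros((n, n), dtype=int)
    for out in range(n):
        mask[out, rng.choice(n, size=degree, replace=False)] = 1
    return mask

def reachability(masks):
    """reach[i, j] == 1 if input j of the first layer can influence output i of the last."""
    reach = np.eye(masks[0].shape[0], dtype=int)
    for m in masks:
        reach = ((m @ reach) > 0).astype(int)
    return reach

n, degree = 16, 4
rng = np.random.default_rng(0)
grouped = [grouped_mask(n, n // degree) for _ in range(3)]
sparse = [random_sparse_mask(n, degree, rng) for _ in range(3)]
grouped_reach = reachability(grouped)
sparse_reach = reachability(sparse)
print("input-output pairs connected (grouped):", int(grouped_reach.sum()))
print("input-output pairs connected (random sparse):", int(sparse_reach.sum()))
```

Both mask families have the same number of connections per layer, but stacking identical grouped masks never lets information cross group boundaries, whereas the random sparse masks can mix filters across layers.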

arXiv.org e-Print Archive

Crossref

A meta-learning approach for training explainable graph neural networks

Author: Scardapane S.
Spinelli I.
Uncini A.
Publication venue: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2022

In this article, we investigate the degree of explainability of graph neural networks (GNNs). The existing explainers work by finding global/local subgraphs to explain a prediction, but they are applied after a GNN has already been trained. Here, we propose a meta-explainer for improving the level of explainability of a GNN directly at training time, by steering the optimization procedure toward minima that allow post hoc explainers to achieve better results, without sacrificing the overall accuracy of the GNN. Our framework (called MATE, MetA-Train to Explain) jointly trains a model to solve the original task, e.g., node classification, and to provide easily processable outputs for downstream algorithms that explain the model's decisions in a human-friendly way. In particular, we meta-train the model's parameters to quickly minimize the error of an instance-level GNNExplainer trained on-the-fly on randomly sampled nodes. The final internal representation relies on a set of features that can be "better" understood by an explanation algorithm, e.g., another instance of GNNExplainer. Our model-agnostic approach can improve the explanations produced for different GNN architectures and use any instance-based explainer to drive this process. Experiments on synthetic and real-world datasets for node and graph classification show that we can produce models that are consistently easier to explain by different algorithms. Furthermore, this increase in explainability comes at no cost to the accuracy of the model.

Archivio della ricerca- Università di Roma La Sapienza

Compressing deep-quaternion neural networks with targeted regularisation

Author: Comminiello D.
Scardapane S.
Uncini A.
Vecchi R.
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/01/2020

In recent years, hyper-complex deep networks (such as complex-valued and quaternion-valued neural networks, QVNNs) have received renewed interest in the literature. They find applications in multiple fields, ranging from image reconstruction to 3D audio processing. Similar to their real-valued counterparts, quaternion neural networks require custom regularisation strategies to avoid overfitting. In addition, for many real-world applications and embedded implementations, there is a need to design sufficiently compact networks, with few weights and neurons. However, the problem of regularising and/or sparsifying QVNNs has not been properly addressed in the literature as of now. In this study, the authors show how to address both problems by designing targeted regularisation strategies, which can minimise the number of connections and neurons of the network during training. To this end, they investigate two extensions of ℓ1 and structured regularisations to the quaternion domain. In the authors' experimental evaluation, they show that these tailored strategies significantly outperform classical (real-valued) regularisation approaches, resulting in small networks especially suitable for low-power and real-time applications.

Archivio della ricerca- Università di Roma La Sapienza

Re-identification of objects from aerial photos with hybrid siamese neural networks

Author: Chiovoloni F.
Devoto A.
Murabito F.
Musmeci R.
Scardapane S.
Spinelli I.
Publication venue: IEEE Computer Society
Publication date: 01/01/2022

In this paper, we consider the task of re-identifying the same object in different photos taken from separate positions and angles during aerial reconnaissance, which is a crucial task for the maintenance and surveillance of critical large-scale infrastructure. To effectively hybridize deep neural networks with available domain expertise for a given scenario, we propose a customized pipeline, wherein a domain-dependent object detector is trained to extract the assets (i.e., sub-components) present on the objects, and a siamese neural network learns to re-identify the objects, exploiting both visual features (i.e., the image crops corresponding to the assets) and the graphs describing the relations among their constituting assets. We describe a real-world application concerning the re-identification of electric poles in the Italian energy grid, showing our pipeline to significantly outperform siamese networks trained from visual information alone. We also provide a series of ablation studies of our framework to underline the effect of including topological asset information in the pipeline, learnable positional embeddings in the graphs, and the effect of different types of graph neural networks on the final accuracy

Archivio della ricerca- Università di Roma La Sapienza

CASTNet: Community-Attentive Spatio-Temporal Networks for Opioid Overdose Forecasting

Author: A Kennedy-Hendricks
A Kolodny
AM Ertugrul
DS Burke
H Jalal
M Pierce
M Warner
NB King
PJ Gruenewald
R Hammersley
RA Rudd
S Hochreiter
S Scardapane
T Bennett
T Seddon
Publication venue
Publication date: 20/09/2019

Opioid overdose is a growing public health crisis in the United States. This crisis, recognized as the "opioid epidemic," has widespread societal consequences including the degradation of health, and the increase in crime rates and family problems. To improve the overdose surveillance and to identify the areas in need of prevention effort, in this work, we focus on forecasting opioid overdose using real-time crime dynamics. Previous work identified various types of links between opioid use and criminal activities, such as financial motives and common causes. Motivated by these observations, we propose a novel spatio-temporal predictive model for opioid overdose forecasting by leveraging the spatio-temporal patterns of crime incidents. Our proposed model incorporates multi-head attentional networks to learn different representation subspaces of features. Such deep learning architecture, called "community-attentive" networks, allows the prediction of a given location to be optimized by a mixture of groups (i.e., communities) of regions. In addition, our proposed model allows for interpreting which features, from which communities, contribute more to predicting local incidents as well as how these communities are captured through forecasting. Our results on two real-world overdose datasets indicate that our model achieves superior forecasting performance and provides meaningful interpretations in terms of spatio-temporal relationships between the dynamics of crime and that of opioid overdose. Comment: Accepted as conference paper at ECML-PKDD 201

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)


Learning Speech Emotion Representations in the Quaternion Domain

Author: Comminiello D.
Guizzo E.
Scardapane S.
Weyde T.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2023

The modeling of human emotion expression in speech signals is an important, yet challenging task. The high resource demand of speech emotion recognition models, combined with the general scarcity of emotion-labelled data, are obstacles to the development and application of effective solutions in this field. In this paper, we present an approach to jointly circumvent these difficulties. Our method, named RH-emo, is a novel semi-supervised architecture aimed at extracting quaternion embeddings from real-valued monaural spectrograms, enabling the use of quaternion-valued networks for speech emotion recognition tasks. RH-emo is a hybrid real/quaternion autoencoder network that consists of a real-valued encoder in parallel to a real-valued emotion classifier and a quaternion-valued decoder. On the one hand, the classifier permits optimization of each latent axis of the embeddings for the classification of a specific emotion-related characteristic: valence, arousal, dominance, and overall emotion. On the other hand, quaternion reconstruction enables the latent dimension to develop intra-channel correlations that are required for an effective representation as a quaternion entity. We test our approach on speech emotion recognition tasks using four popular datasets: IEMOCAP, RAVDESS, EmoDB, and TESS, comparing the performance of three well-established real-valued CNN architectures (AlexNet, ResNet-50, VGG) and their quaternion-valued equivalents fed with the embeddings created with RH-emo. We obtain a consistent improvement in the test accuracy for all datasets, while drastically reducing the resource demand of the models. Moreover, we performed additional experiments and ablation studies that confirm the effectiveness of our approach.

City Research Online

Group sparse regularization for deep neural networks

Author: Comminiello D.
Hussain A.
Scardapane S.
Uncini A.
Publication venue: Elsevier
Publication date: 02/07/2016

In this paper, we address the challenging task of simultaneously optimizing (i) the weights of a neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are traditionally dealt with separately, we propose an efficient regularized formulation enabling their simultaneous parallel execution, using standard optimization routines. Specifically, we extend the group Lasso penalty, originally proposed in the linear regression literature, to impose group-level sparsity on the network’s connections, where each group is defined as the set of outgoing weights from a unit. Depending on the specific case, the weights can be related to an input variable, to a hidden neuron, or to a bias unit, thus performing simultaneously all the aforementioned tasks in order to obtain a compact network. We carry out an extensive experimental evaluation, in comparison with classical weight decay and Lasso penalties, both on a toy dataset for handwritten digit recognition, and multiple realistic mid-scale classification benchmarks. Comparative results demonstrate the potential of our proposed sparse group Lasso penalty in producing extremely compact networks, with a significantly lower number of input features, with a classification accuracy which is equal or only slightly inferior to standard regularization terms
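The group-level penalty described in this abstract, where each group is the set of outgoing weights of one unit, can be sketched as follows. This is a hedged illustration rather than the paper's implementation: the sqrt(group size) scaling follows the common group Lasso convention and is an assumption here, and the function names and the small weight matrix are hypothetical.

```python
import numpy as np

def group_lasso_penalty(W):
    """W has shape (n_in, n_out); row i holds the outgoing weights of unit i.

    Returns the sum over units of sqrt(group size) * Euclidean norm of the row.
    """
    group_size = W.shape[1]
    return float(np.sqrt(group_size) * np.sum(np.linalg.norm(W, axis=1)))

def active_units(W, tol=1e-8):
    """Indices of units whose whole outgoing-weight group is not (numerically) zero."""
    return np.flatnonzero(np.linalg.norm(W, axis=1) > tol)

# Toy weight matrix: unit 1 has all-zero outgoing weights and can be pruned.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0]])
penalty = group_lasso_penalty(W)
print("penalty:", penalty)
print("units kept:", active_units(W).tolist())
```

Because the norm of a group is not differentiable at zero, minimizing this penalty tends to drive entire rows to zero at once. Applied to input rows this corresponds to feature selection, and applied to hidden-unit rows it corresponds to neuron pruning.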

arXiv.org e-Print Archive

Stirling Online Research Repository (RIOXX)

Stirling Online Research Repository

Repository@Napier

Archivio della ricerca- Università di Roma La Sapienza

Effects of photoperiod on epididymal and sperm morphology in a wild rodent, the viscacha (Lagostomus maximus maximus)

Author: Aguilera Merlo C.
Chaves E. M.
Cruceño A. M.
De Rosas J. C.
Dominguez S.
Foscolo Mabel Rosa
Scardapane L.
Publication venue: Hindawi Limited
Publication date: 02/12/2012

The viscacha (Lagostomus maximus maximus) is a seasonal South American wild rodent. The adult males exhibit an annual reproductive cycle with periods of maximum and minimum gonadal activity. Four segments have been identified in the epididymis of this species: initial, caput, corpus, and cauda. The main objective of this work was to relate the seasonal morphological changes observed in the epididymal duct with the data from epididymal sperm during periods of activity and gonadal regression using light and scanning electron microscopy (SEM). Under light and electron microscopy, epididymal corpus and cauda showed marked seasonal variations in structural parameters and in the distribution of different cellular populations of epithelium. Initial and caput segments showed mild morphological variations between the two periods. Changes in epididymal sperm morphology were observed in the periods analyzed and an increased number of abnormal gametes were found during the regression period. During this period, anomalies were found mainly in the head, midpiece, and neck, while in the activity period, defects were found only in the head. Our results confirm that the morphological characteristics of the epididymal segments, as well as sperm morphology, undergo significant changes during the reproductive cycle of Lagostomus.

Affiliation: Cruceño, A. M. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia. Cátedra de Histología; Argentina
Affiliation: De Rosas, J. C. Consejo Nacional de Investigaciones Científicas y Tecnicas. Centro Cientifico Tecnologico Mendoza. Instituto Histologia y Embriologia de Mendoza "Dr. M. Burgos"; Argentina. Universidad Nacional de Cuyo. Facultad de Ciencias Médicas; Argentina
Affiliation: Foscolo, Mabel Rosa. Consejo Nacional de Investigaciones Científicas y Tecnicas. Centro Cientifico Tecnologico Mendoza. Instituto Histologia y Embriologia de Mendoza "Dr. M. Burgos"; Argentina
Affiliation: Chaves, E. M. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia. Cátedra de Histología; Argentina
Affiliation: Scardapane, L. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia. Cátedra de Histología; Argentina
Affiliation: Dominguez, S. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia. Cátedra de Histología; Argentina
Affiliation: Aguilera Merlo, C. Universidad Nacional de San Luis. Facultad de Química, Bioquímica y Farmacia. Cátedra de Histología; Argentin

CONICET Digital

PubMed Central

Multifunctional Core@Satellite Magnetic Particles for Magnetoresistive Biosensors

Author: Acunzo A.
Campanile R.
Cardoso S.
Della Ventura B.
Di Girolamo R.
Iannotti V.
Martins V. C.
Minopoli A.
Scardapane E.
Velotta R.
Publication venue: American Chemical Society (ACS)
Publication date: 01/01/2022

Magnetoresistive (MR) biosensors combine distinctive features such as small size, low cost, good sensitivity, and propensity to be arrayed to perform multiplexed analysis. Magnetic nanoparticles (MNPs) are the ideal target for this platform, especially if modified not only to overcome their intrinsic tendency to aggregate and lack of stability but also to realize an interacting surface suitable for biofunctionalization without strongly losing their magnetic response. Here, we describe an MR biosensor in which commercial MNP clusters were coated with gold nanoparticles (AuNPs) and used to detect human IgG in water using an MR biochip that comprises six sensing regions, each one containing five U-shaped spin valve sensors. The isolated AuNPs (satellites) were stuck onto an aggregate of individual iron oxide crystals (core) so that the resulting core@satellite magnetic particles (CSMPs) could be functionalized by the photochemical immobilization technique, an easy procedure that leads to oriented antibodies immobilized upright onto gold. The morphological, optical, hydrodynamic, magnetic, and surface charge properties of CSMPs were compared with those exhibited by the commercial MNP clusters, showing that the proposed coating procedure endows the MNP clusters with stability and ductility without being detrimental to magnetic properties. Eventually, the high-performance MR biosensor allowed us to detect human IgG in water with a detection limit of 13 pM (2 ng mL⁻¹). Given its portability, the biosensor described in this paper lends itself to a point-of-care device; moreover, the features of the MR biochip also make it suitable for multiplexed analysis.
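As a quick consistency check on the two forms of the detection limit quoted in this abstract, the molar value can be converted to a mass concentration. The molar mass of roughly 150 kDa for human IgG is an assumption taken from general knowledge, not from the abstract.

```latex
% Assumed molar mass of IgG: M \approx 1.5 \times 10^{5}\ \mathrm{g\,mol^{-1}}
\begin{align*}
c_{\text{mass}} &= c_{\text{molar}} \times M \\
                &\approx \left(13 \times 10^{-12}\ \mathrm{mol\,L^{-1}}\right)
                   \times \left(1.5 \times 10^{5}\ \mathrm{g\,mol^{-1}}\right) \\
                &= 1.95 \times 10^{-6}\ \mathrm{g\,L^{-1}}
                 = 1.95\ \mathrm{ng\,mL^{-1}} \approx 2\ \mathrm{ng\,mL^{-1}}
\end{align*}
```

Under that assumption, 13 pM and 2 ng mL⁻¹ agree to within rounding.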

Archivio della ricerca - Università degli studi di Napoli Federico II